
Enable DFLASH support for additional model backends #22358

Merged
hnyls2002 merged 1 commit into sgl-project:main from mmangkad-dev:enable-dflash-model-support on Apr 9, 2026


Contributor

@mmangkad mmangkad commented Apr 8, 2026

Summary

Enable DFLASH for additional supported models from the z-lab collection: https://huggingface.co/collections/z-lab/dflash

Based on #20547. Landing this early enables support for these models now, without waiting for the DFlash spec v2 to merge.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces DFLASH auxiliary hidden state capture across several models, including DeepSeek-V2, GPT-OSS, Kimi-K25, and the Qwen3 series. The changes primarily involve adding set_dflash_layers_to_capture and get_input_embeddings methods. Review feedback highlights several issues: a potential AttributeError and unsafe tuple unpacking in qwen3_vl.py, as well as inconsistencies in qwen3_5.py regarding layer index offsets, pipeline parallelism validation, and return type logic.
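As a rough illustration of the pattern the review describes, here is a minimal sketch in plain Python. It is not the code from this PR: `ToyModel`, its layers, and the `unpack_forward` helper are hypothetical, and only the method names `set_dflash_layers_to_capture` and `get_input_embeddings` come from the review text. The sketch assumes that captured layer ids are 0-based and refer to a layer's output; the real offset convention may differ, which is one of the inconsistencies the review flags.

```python
from typing import List, Sequence, Tuple, Union

Vector = List[float]
ForwardOutput = Union[Vector, Tuple[Vector, List[Vector]]]


class ToyModel:
    """Hypothetical stand-in for a model backend (not SGLang code)."""

    def __init__(self, num_layers: int, vocab_size: int = 4, hidden_size: int = 2):
        # Toy embedding table: token id t maps to a vector filled with t.
        self.embed_tokens = [[float(t)] * hidden_size for t in range(vocab_size)]
        # Toy layers: layer i adds (i + 1) to every element.
        self.layers = [self._make_layer(i + 1) for i in range(num_layers)]
        self.layers_to_capture: List[int] = []

    @staticmethod
    def _make_layer(delta: int):
        return lambda hidden: [x + delta for x in hidden]

    def get_input_embeddings(self) -> List[Vector]:
        return self.embed_tokens

    def set_dflash_layers_to_capture(self, layer_ids: Sequence[int]) -> None:
        # Assumption: ids are 0-based and name the layer whose output is kept.
        for layer_id in layer_ids:
            if not 0 <= layer_id < len(self.layers):
                raise ValueError(f"layer id {layer_id} out of range")
        self.layers_to_capture = sorted(set(layer_ids))

    def forward(self, token_id: int) -> ForwardOutput:
        hidden = list(self.embed_tokens[token_id])
        aux_hidden_states: List[Vector] = []
        for idx, layer in enumerate(self.layers):
            hidden = layer(hidden)
            if idx in self.layers_to_capture:
                aux_hidden_states.append(list(hidden))
        # The return type changes when capture is enabled.
        if self.layers_to_capture:
            return hidden, aux_hidden_states
        return hidden


def unpack_forward(output: ForwardOutput) -> Tuple[Vector, List[Vector]]:
    """Unpack defensively instead of assuming a tuple is always returned."""
    if isinstance(output, tuple):
        hidden, aux_hidden_states = output
        return hidden, aux_hidden_states
    return output, []
```

Checking the output type before unpacking is one way to avoid the kind of unsafe tuple unpacking mentioned above. A caller that always writes `hidden, aux = model.forward(...)` would fail whenever capture is disabled.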

Contributor Author

mmangkad commented Apr 9, 2026

cc @hnyls2002, could you help check this? I can confirm it works for all the available models.

@hnyls2002
Collaborator

/tag-and-rerun-ci

@github-actions github-actions bot added the run-ci label Apr 9, 2026
@hnyls2002 hnyls2002 merged commit c3833ba into sgl-project:main Apr 9, 2026
124 of 180 checks passed
@mmangkad mmangkad deleted the enable-dflash-model-support branch April 9, 2026 21:47
2 participants